Abstract. We show that if a numerical method is posed as a sequence of
operators acting on data and depending on a parameter, typically a measure 
of the size of discretization, then consistency, convergence and stability can be 
related by a Lax-Richtmyer type equivalence theorem - a consistent method 
is convergent if and only if it is stable. We define consistency as convergence 
on a dense subspace and stability as discrete well-posedness. In some
applications convergence is harder to prove than consistency or stability since
convergence requires knowledge of the solution. An equivalence theorem can 
be useful in such settings. We give concrete instances of equivalence theorems 
for polynomial interpolation, numerical differentiation, numerical integration 
using quadrature rules and Monte Carlo integration. 



1. Introduction 

For a numerical method the three most important aspects are its consistency, 
convergence and stability. These three were related in the well known equivalence
theorem of Lax and Richtmyer for finite difference methods for certain partial
differential equations [13]. We show that in a very general setting of numerical methods,
in which a numerical method is posed as a family of operators acting on data, there 
is a Lax-Richtmyer type equivalence theorem : a consistent method is convergent 
if and only if it is stable. After proving the theorem in two general settings
(Theorem 3.3 and Theorem 3.5), we prove it in specific instances of numerical integration
using quadrature, numerical integration using Monte Carlo methods, numerical 
differentiation and polynomial interpolation. 

Consistency is a measure of how good the discretization is. Roughly, it says that 
the discretization is close to the smooth operator in some sense. If the discrete 
solution converges to the smooth solution then the numerical method is said to 
be convergent. Note that for discussing consistency and convergence one needs 
some information about the smooth problem and the smooth solution. However, 
numerical stability is purely a property of the discrete scheme. Roughly, stability 
means that the propagated error is controlled by the error in the data. Hence there 
is a similarity between numerical stability and well-posedness. 

In practice, convergence can be the hardest to prove among consistency, conver- 
gence and stability, since the actual solution is usually not known. Hence equiva- 
lence theorems can be useful in such situations. Equivalence theorems essentially 
say that we need not worry about the convergence while solving a problem
numerically as long as its discretization is consistent with the smooth problem and
the discrete scheme is stable. In addition, such theorems also show that unstable
schemes will not converge for some data. These are the same advantages that are 
often appreciated in the setting of the classical Lax-Richtmyer equivalence theorem 
for finite difference schemes (see [16], page 32). 

After the preliminaries in the next section, in Section 3 we define consistency,
convergence and stability in a general context of operators acting on data and we 
then prove equivalence theorems in this setting. The three sections that follow spe- 
cialize these notions to specific classes of basic numerical methods. The convergence 
and stability theory for these example areas are well understood, but equivalence 
theorems have not been discussed in these areas in the literature. 

As given here, the equivalence theorems in these example areas serve only as 
concrete instances for illustration of the main ideas. We make no claims that these 
examples have direct practical importance in numerical analysis. However, with 
proper generalizations, the ideas might be of use in practical situations. For
example, convergence of multidimensional interpolation can be related to its stability
and consistency, as we sketch in Section 6. Until now the advantages of equivalence
theorems have been limited to finite difference methods. We suggest that similar
benefits may be possible in many areas of numerical analysis. 

2. Preliminaries 

For the convenience of the reader, we state some definitions and theorems used
later on in the paper. The uniform boundedness principle is one of the fundamental
building blocks of functional analysis and it is useful for proving equivalence
theorems in the linear operator setting. It says that a family of pointwise bounded
continuous linear operators defined on a complete normed linear space is uniformly
bounded.

Theorem 2.1 (Uniform Boundedness Principle). Let $\{F_i : i \in I\}$ be a set of bounded
linear operators from a Banach space $V$ to a normed linear space $W$, where $I$ is
an arbitrary set. Assume that for every $v \in V$, the set $\{F_i(v)\}$ is bounded. Then
$\sup_{i \in I} \|F_i\| < \infty$.

Proof. See pQ or [B]. The main ingredient of the proof is the Baire category theorem. □

The next lemma will be used in Section 6 for proving that polynomial
interpolation operators are bounded.

Lemma 2.2. Let $V$ and $W$ be normed linear spaces over $\mathbb{R}$, where $W$ is finite
dimensional. Let $T : V \to W$ be a surjective linear operator with a closed kernel.
Then $T$ is continuous.

Proof. Since $K = T^{-1}(0)$ is closed, $V/K$ is a normed linear space (see for instance
Theorem 4.2 on page 70 in [6]). For $v \in V$ the norm in $V/K$ is defined as usual to
be $\|v + K\|_{V/K} := \inf\{\|v + k\| : k \in K\}$, which is the distance of $v$ from $K$. Let
$\tilde{T} : V/K \to W$ be the unique linear map such that $T = \tilde{T} \circ \pi$, where $\pi : V \to V/K$ is
the quotient map. Recall that the quotient map is continuous. Note also that $\tilde{T}$ is
a bijection. Then since $\tilde{T}$ is a linear bijection between $V/K$ and $W$, and $W$ is finite
dimensional, we have that $V/K$ is finite dimensional. Thus $\tilde{T}$ is continuous (see
page 56 of pQ). Hence $T$ is continuous since it is the composition of two continuous
functions. □

Remark 2.3. Note that all that one needs above is that the image of $T$ in $W$ be
finite dimensional and the kernel be closed. The surjectivity of $T$ is not required in
that case.

The remaining part of this section deals with some basic facts about probability
theory which are needed when we discuss Monte Carlo integration. These are not
needed for the general equivalence theorems of Section 3 or other sections with the
exception of Section 4.2. Let $(\Omega, \Sigma, P)$ denote a probability space [19] where $\Omega$
is the space of outcomes, $\Sigma$ is a $\sigma$-algebra of subsets of $\Omega$, and $P$ is a countably
additive probability measure on $\Sigma$.

Definition 2.4. A random variable $X : \Omega \to \mathbb{R}$ is a measurable function, i.e., for
every Borel set $B \subseteq \mathbb{R}$, $X^{-1}(B) \in \Sigma$.

The probability measure $\alpha = P X^{-1}$ defined on $\mathbb{R}$ is called the distribution of
$X$. We will assume that the random variable $X$ is continuous, i.e., there exists
a nonnegative function $f(x)$, called the probability density function (pdf) of the
random variable $X$, defined on $\mathbb{R}$ such that for all Borel sets $A \subseteq \mathbb{R}$
\[ \alpha(A) = P[\omega : X(\omega) \in A] = \int_A f(x)\,dx . \]

The mean or the expectation of the random variable $X$, if it exists, is
$E(X) = \int_{\mathbb{R}} x f(x)\,dx$.

Definition 2.5. Two random variables $X$, $Y$ are said to be independent if
\[ P[\omega : X(\omega) \in A,\ Y(\omega) \in B] = P[\omega : X(\omega) \in A]\, P[\omega : Y(\omega) \in B] \]
for all Borel sets $A$, $B$.

Theorem 2.6 (Strong Law of Large Numbers). Let $X_1, X_2, \ldots, X_n, \ldots$ be
a sequence of independent and identically distributed random variables with finite
mean $\mu$. Then
\[ P\left[\omega : \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} X_i(\omega) = \mu\right] = 1 . \]

Proof. See [3]. □

Theorem 2.7. Let $X$, $Y$ be two independent and identically distributed random
variables. If $f$ is a Borel measurable function [15] on $\mathbb{R}$, then the random variables
$f(X)$, $f(Y)$ are independent and identically distributed.

Proof. Let $A$, $B$ be any Borel sets in $\mathbb{R}$. Then
\begin{align*}
P[\omega : f(X(\omega)) \in A,\ f(Y(\omega)) \in B] &= P[\omega : X(\omega) \in f^{-1}(A),\ Y(\omega) \in f^{-1}(B)] \\
&= P[\omega : X(\omega) \in f^{-1}(A)]\, P[\omega : Y(\omega) \in f^{-1}(B)] \\
&= P[\omega : f(X(\omega)) \in A]\, P[\omega : f(Y(\omega)) \in B] .
\end{align*}
Hence $f(X)$, $f(Y)$ are independent.

Let $\alpha$ be the distribution measure of both $X$ and $Y$, i.e.,
\[ \alpha(A) = P[\omega : X(\omega) \in A] = P[\omega : Y(\omega) \in A] . \]
Let $\beta_1$ ($\beta_2$) be the distribution of $f(X)$ ($f(Y)$ respectively). Therefore,
\begin{align*}
\beta_1(A) &= P[\omega : f(X(\omega)) \in A] = P[\omega : X(\omega) \in f^{-1}(A)] = \alpha(f^{-1}(A)) , \\
\beta_2(A) &= P[\omega : f(Y(\omega)) \in A] = P[\omega : Y(\omega) \in f^{-1}(A)] = \alpha(f^{-1}(A)) .
\end{align*}
Hence $\beta_1 = \beta_2\ (= \alpha f^{-1})$. □

Remark 2.8. We need the condition of Borel measurability on $f$ because for every
Borel set $A$, we want $f^{-1}A$ to be a Borel set so that $X^{-1}(f^{-1}A) \in \Sigma$. Otherwise,
if $f$ is not a Borel measurable function, then $f^{-1}A$ need not be a Borel measurable
set and then $X^{-1}(f^{-1}A)$ would not be in $\Sigma$.



3. Consistency, Convergence and Stability 

If a smooth problem can be formulated as an operator applied to data, then 
the discretization can usually be formulated as a family of operators depending on 
some discretization parameter, typically a measure of the mesh size. For example, 
the parameter might be a measure of the distance between nodes in quadrature, or 
between interpolation points in polynomial interpolation, etc. 

The smooth and discrete operators can be made to act on the same space. In 
some cases this may be done, for example, by considering continuous functions
instead of discrete data. We then define the discrete scheme to be convergent if 
the discrete operators converge to the smooth operator pointwise on the entire 
space. We define consistency to be convergence on a dense subspace. If the discrete 
operators are bounded linear then the definition of stability is uniform boundedness 
of the family of discrete operators. However, when the discrete operators are general 
nonlinear operators, we define stability as asymptotic pointwise boundedness of the 
family of operators. The precise definitions are given below in Definitions 3.1 and 3.4.

The two definitions of stability lead to two different proofs for the equivalence 
theorem, both of which appear in this section. Theorem 3.3 is a Lax-Richtmyer
type equivalence theorem applicable to general numerical analysis problems when
the discrete operators involved are bounded linear. When this condition is dropped,
we get Theorem 3.5. Specific incarnations of the linear case theorem are proved later
in the context of numerical integration (Theorem 4.3), numerical differentiation
(Theorem 5.2), and polynomial interpolation (Theorem 6.5). We treat the general
problem of Monte Carlo integration without the assumption of linearity. This leads
to a proof of an equivalence theorem (Theorem 4.8) analogous to the nonlinear case
but with a probabilistic flavor. 

In the classical Lax-Richtmyer equivalence theorem for partial differential
equations, the main ingredient in the proof of stability implying convergence is
essentially the triangle inequality and a density argument [13]. Similarly, to prove the
equivalence theorems in this paper, the main tools we use are the uniform boundedness
principle, the triangle inequality and some density arguments.

Let $V$ be a Banach space and $W$ a normed linear space. Let $h \in (0, 1)$. The
upper limit of $h$ is not relevant because we will be considering limits as $h \to 0$. Let
$T, T_h : V \to W$ be bounded linear operators.

Definition 3.1 (Linear Discrete Operator Case). If $\lim_{h \to 0} \|(T_h - T)v\| = 0$ for
every $v \in V$, then $T_h$ is said to converge to $T$, and if $\lim_{h \to 0} \|(T_h - T)v\| = 0$ for
every $v \in V_0$, where $V_0$ is a dense subspace of $V$, then $T_h$ is said to be consistent
with $T$. If $\sup_h \|T_h\| < \infty$ then $T_h$ is called stable.

Our definition of consistency is motivated by the definition of consistency for
finite difference schemes for certain PDEs. For a partial differential equation $Pu = f$,
where $u$ is the unknown, and a finite difference scheme $P_{k,h} v = f$, the finite
difference scheme is called consistent if for any smooth function $\phi(t, x)$, $P\phi - P_{k,h}\phi \to 0$
as $k, h \to 0$. Here $k$ and $h$ are a measure of the space and time meshes
respectively. The convergence is pointwise convergence at every $(t, x)$. This definition is
from [16]. Although density is not explicitly mentioned in this definition, note that
smooth functions are dense in the typical function spaces in which the solutions
live.

Remark 3.2. Observe that by our definition of consistency, convergence implies 
consistency. However, there are other definitions of consistency under which there 
exist inconsistent schemes which converge. See for instance, [21] and Example 1.4.3
in [16] in the context of finite difference schemes for PDEs. 

The stability definition above is equivalent to discrete well-posedness, i.e.,
well-posedness of the discrete problems, which means that for any $h$ and for any $v_1, v_2$ in $V$,
$\|T_h(v_1 - v_2)\| \le K \|v_1 - v_2\|$ where $K > 0$ is some constant.

Theorem 3.3 (Equivalence Theorem for Linear Discrete Operators). A
consistent family of operators $T_h$ is convergent if and only if it is stable.

Proof. Suppose the family is convergent, i.e., $\lim_{h \to 0} T_h v = Tv$ for every $v \in V$.
Hence $\|T_h v\| \le K(v)$ where $K(v)$ is a constant possibly depending on $v$. Since $V$ is
complete, by the uniform boundedness principle we have uniform boundedness of $T_h$,
i.e., $\sup_h \|T_h\| \le K$, for some $K > 0$, and hence stability.

Conversely, suppose that $T_h$ is consistent and stable. Since $V_0$ is a dense subspace
of $V$, for a given $v \in V$ and $\epsilon > 0$ choose $v_0 \in V_0$ such that
\[ \|v - v_0\| < \frac{\epsilon}{3 \max\{\|T\|, \sup_h \|T_h\|\}} . \]
Because of consistency, there exists $h_0 \in (0, 1)$ such that, for all $h < h_0$ we have
that $\|T_h v_0 - T v_0\| < \epsilon/3$. Hence for all $h < h_0$
\begin{align*}
\|T_h v - Tv\| &\le \|T_h v - T_h v_0\| + \|T_h v_0 - T v_0\| + \|T v_0 - T v\| \\
&\le \|T_h\|\, \|v - v_0\| + \|T_h v_0 - T v_0\| + \|T\|\, \|v_0 - v\| \\
&< \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon .
\end{align*}
Therefore we have convergence. □

For the case when the discrete operators are nonlinear we give a different
definition of stability since uniform boundedness is unlikely. This allows us to prove an
equivalence theorem in this case as well. Most of the examples we give in later
sections are linear ones. The proof of the equivalence theorem in the case of nonlinear
discrete operators (Theorem 3.5) is different from the linear case.

Let $T_n$ be a sequence of operators (not necessarily linear or continuous) from $V$
to $W$, where $V$ and $W$ are normed linear spaces. Let $T : V \to W$ be a bounded linear
operator.

Definition 3.4 (Nonlinear Discrete Operator Case). The discrete operator $T_n$ is
said to converge to $T$ if, for any given $v \in V$, the sequence $T_n v$ converges to $Tv$
in $W$, i.e., for each $v \in V$, given $\epsilon > 0$ there exists $n_0 \in \mathbb{N}$ such that for all $n > n_0$,
$\|T_n v - Tv\| < \epsilon$. If there is a dense subspace $V_0$ of $V$ such that for any $v$ in $V_0$, the
sequence $T_n v$ converges to $Tv$ in $W$, then $T_n$ is said to be consistent with $T$. The
operator $T_n$ is stable at $v_0 \in V$ if for any $\epsilon > 0$ there exists a $\delta > 0$ such that
for each $v \in V$ with $\|v - v_0\| < \delta$, there exists $n_0 \in \mathbb{N}$ such that for all $n > n_0$,
$\|T_n v - T_n v_0\| < \epsilon$. It is stable if it is stable at every $v_0 \in V$.

Remark 3.2, about convergence implying consistency, which was made earlier for
the linear case, is also valid for the nonlinear case covered in Definition 3.4, since the
consistency and convergence definitions are the same in both cases.

Theorem 3.5 (Equivalence Theorem for Nonlinear Discrete Operators).
A consistent family of operators $T_n$ is convergent if and only if it is stable.

Proof. Suppose $T_n$ is convergent. This implies that given $v, v_0$ in $V$ and $\epsilon > 0$ there
exist $n_v, n_{v_0} \in \mathbb{N}$ such that $\|T_n v - Tv\| < \epsilon/3$ for all $n > n_v$ and
$\|T_n v_0 - T v_0\| < \epsilon/3$ for all $n > n_{v_0}$. We need to find an $n_0 \in \mathbb{N}$ and a $\delta > 0$ such
that for a given $v \in V$ with $\|v - v_0\| < \delta$, $\|T_n v - T_n v_0\| < \epsilon$ for all $n > n_0$. But
\begin{align*}
\|T_n v - T_n v_0\| &= \|T_n v - Tv + Tv - T v_0 + T v_0 - T_n v_0\| \\
&\le \|T_n v - Tv\| + \|Tv - T v_0\| + \|T v_0 - T_n v_0\| .
\end{align*}
Since $T$ is a bounded linear operator, if we choose $\delta$ appropriately, then
$\|Tv - T v_0\| \le \|T\|\, \|v - v_0\| < \epsilon/3$. Letting $n_0 = \max(n_v, n_{v_0})$ we get
$\|T_n v - T_n v_0\| < \epsilon$ for all $n > n_0$. Since $v_0$ was arbitrary, $T_n$ is stable if it is convergent.

Conversely, suppose we assume stability and consistency. We will show convergence
at $v_0$. Thus we need to find an $n_0 \in \mathbb{N}$ such that $\|T_n v_0 - T v_0\| < \epsilon$ for all $n > n_0$.

Stability at $v_0$ means that there exists $\delta > 0$ such that for each $v_0' \in V_0$ with
$\|v_0' - v_0\| < \delta$ there exists $n_1 \in \mathbb{N}$ with $\|T_n v_0 - T_n v_0'\| < \epsilon/3$ for all $n > n_1$.
Since $T$ is a bounded linear operator, if we choose $\delta$ appropriately we can also make
$\|T v_0' - T v_0\| < \epsilon/3$. Choose such a $\delta$ and $v_0' \in V_0$. Note that
\[ \|T_n v_0 - T v_0\| \le \|T_n v_0 - T_n v_0'\| + \|T_n v_0' - T v_0'\| + \|T v_0' - T v_0\| . \]
By the choice of $v_0'$ the first and last
terms on the right hand side of the above inequality are already at most $\epsilon/3$. By
consistency, there exists $n_2 \in \mathbb{N}$ such that $\|T_n v_0' - T v_0'\| < \epsilon/3$ whenever $n > n_2$.
Choose $n_0 = \max(n_1, n_2)$. Then for all $n > n_0$ we have $\|T_n v_0 - T v_0\| < \epsilon$. Hence
we have convergence at $v_0$. Since $v_0$ was arbitrary, we have that stability implies
convergence. □

In the linear case above (Definition 3.1 and Theorem 3.3), we used a real
parameter $h$. Typically this will be a measure of size of discretization, such as the
maximum distance between adjacent nodes in the partition of an interval. In the
nonlinear case (Definition 3.4 and Theorem 3.5) we chose to use a natural number $n$
as the parameter. This might stand, for example, for the number of times sampling 
is done in Monte Carlo integration. This change from real h to natural number n 
was done to give both flavors of the definitions and proofs. Each can be written 
using either h or n. 

Remark 3.6. Note that in the proofs of both the equivalence theorems above, we did 
not assume consistency to show that convergence implies stability. It would have 
been redundant anyway, to assume consistency when we already have convergence 
(see Remark 3.2).

Remark 3.7. In [5] (page 67) consistency, convergence and stability are defined in a
general setting of linear operators. The problem setting is the solution of the equation
$Lv = w$, where $L$ is a bounded linear operator. This is discretized as $L_n v_n = w$.
Here $v$ and $v_n$ are the unknowns and $w$ is known. Under their definitions they show
one direction of the equivalence theorem, that a consistent method is convergent if it is
stable. Our setting however is that of "direct" problems. In our case the object
being approximated discretely is $Tf$, which is approximated by $T_h f$, where $f$ is
known data. In [2] an inverse of the operator $L$ is required. In our case $T$ may go
from the space of continuous functions to reals (as in Section 5) so that an inverse
may not exist.



4. Numerical Integration 

As the first application of the ideas of Section 3 we now discuss the notions of
consistency, convergence and stability of numerical integration. We only address
definite integrals of continuous functions on the real line. The two most successful
methods for numerical integration are quadrature rules and Monte Carlo integration
and these are covered below in Sections 4.1 and 4.2. The definitions and proof of
the theorem for quadrature are identical to the linear case in Section 3. For Monte
Carlo integration however, the notions of consistency, convergence and stability
need to be put into a probabilistic setting and the equivalence theorem proof uses
some probabilistic reasoning. Otherwise, the pattern of the proof follows that of
Theorem 3.5. In the quadrature case we apply the theorem to infer convergence
of Gaussian quadrature from a simple proof of its stability and we also discuss
the composite trapezoidal rule and the instability of Newton-Cotes quadrature. In the
case of Monte Carlo integration we discuss the Sample Mean method as an example.



4.1. Quadrature. The numerical approximation of definite integrals is often done
using quadrature rules [ITj. Let $V = (C[a,b], \|\cdot\|_\infty)$. For $f \in V$ define
$I(f) = \int_a^b f(x)\,dx$, which can be approximated by a sequence of quadratures
\[ I_n(f) = \sum_{i=0}^{n} w_i^{(n)} f(x_i^{(n)}) , \]
where $a \le x_0^{(n)} < x_1^{(n)} < \cdots < x_n^{(n)} \le b$ is a partition of $[a, b]$.
The points $x_i^{(n)}$ are called nodes. These nodes are not necessarily equally spaced or
progressive. (In progressive quadrature if the number of nodes is increased from $n_1$
to $n_2$ then only $n_2 - n_1$ nodes are new.) The real linear functional $I : V \to \mathbb{R}$ is a
bounded linear operator and $\|I\| = b - a$.

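Each $I_n : V \to \mathbb{R}$ is also a linear functional on $V$ and
\[ |I_n(f)| = \left| \sum_{i=0}^{n} w_i^{(n)} f(x_i^{(n)}) \right| \le \sum_{i=0}^{n} |w_i^{(n)}|\, |f(x_i^{(n)})| \le \|f\|_\infty \sum_{i=0}^{n} |w_i^{(n)}| . \]
Thus $\|I_n\| = \sum_{i=0}^{n} |w_i^{(n)}|$, and so each $I_n$ is also a bounded linear functional.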

With these preliminaries, we can define the stability, consistency and convergence
of quadrature rules in exactly the same way as was done in Definition 3.1.



Definition 4.1. A quadrature rule $I_n$ is said to converge to $I$ if $I_n(f) \to I(f)$ for
every $f$ in $V$. It is consistent if it converges on a dense subspace of $V$ and stable
if $\sup_n \|I_n\| < \infty$.

Remark 4.2. The motivation for the above definition of stability is the following.
\begin{align*}
|I_n(f_1) - I_n(f_2)| &= \left| \sum_{i=0}^{n} w_i^{(n)} \left( f_1(x_i^{(n)}) - f_2(x_i^{(n)}) \right) \right| \\
&\le \sum_{i=0}^{n} |w_i^{(n)}|\, \left| f_1(x_i^{(n)}) - f_2(x_i^{(n)}) \right| \\
&\le \left( \sum_{i=0}^{n} |w_i^{(n)}| \right) \|f_1 - f_2\|_\infty .
\end{align*}
Thus $\sup_n \sum_{i=0}^{n} |w_i^{(n)}| < \infty$ gives discrete well-posedness, i.e., stability.



Now we prove an equivalence theorem for quadrature whose statement and proof
are exactly the same as those of the general linear equivalence theorem proved earlier
(Theorem 3.3). We have decided to use the natural number $n$ as a parameter here
instead of the real parameter $h$ of Theorem 3.3 because this is the natural setting
for quadrature ($n + 1$ is the number of nodes).

Theorem 4.3 (Equivalence Theorem for Quadrature). A consistent
quadrature rule is convergent if and only if it is stable.

Proof. Suppose $I_n$ converges to $I$, i.e., $I_n(f) \to I(f)$ for every $f$ in $V$. This implies
that for any given $f$ in $V$, the sequence $\{I_n(f)\}$ is bounded. Since each $I_n$ is a
bounded linear functional, we can apply the uniform boundedness principle, which
gives us that $\sup_n \|I_n\| < \infty$, which is the definition of stability.

Conversely assume stability and consistency. By the definition of consistency,
$I_n(f) \to I(f)$ for all $f$ in $V_0$, where $V_0$ is a dense subspace of $V$. Stability means
that $\sup_n \|I_n\| < \infty$. By the density of $V_0$ in $V$, given $f \in V$ and $\epsilon > 0$ choose
$f_0 \in V_0$ such that
\[ \|f - f_0\|_\infty < \frac{\epsilon}{3 \max\{\|I\|, \sup_n \|I_n\|\}} . \]
Hence
\begin{align*}
|I(f) - I_n(f)| &\le |I(f) - I(f_0)| + |I(f_0) - I_n(f_0)| + |I_n(f_0) - I_n(f)| \\
&\le \|I\|\, \|f - f_0\|_\infty + |I(f_0) - I_n(f_0)| + \|I_n\|\, \|f_0 - f\|_\infty \\
&\le \|I\|\, \|f - f_0\|_\infty + |I(f_0) - I_n(f_0)| + \sup_n \|I_n\|\, \|f - f_0\|_\infty \\
&< \frac{\epsilon}{3} + |I(f_0) - I_n(f_0)| + \frac{\epsilon}{3} .
\end{align*}
By consistency, there exists $n_0 \in \mathbb{N}$ such that $|I(f_0) - I_n(f_0)| \le \epsilon/3$ for all $n > n_0$.
Hence $|I(f) - I_n(f)| < \epsilon$ for all $n > n_0$. Therefore $I_n(f) \to I(f)$ for every $f$
in $V$. □

Now we use the equivalence theorem to show convergence of Gaussian quadrature
rules and of the composite trapezoidal rule, followed by the nonconvergence and
instability of Newton-Cotes rules.

Example 4.4 (Gaussian Quadrature). In Gaussian quadrature all the weights
$w_i^{(n)} > 0$. Hence $\|I_n\| = \sum_{i=0}^{n} w_i^{(n)}$. The Gaussian quadrature rule $I_n$ is exact for
all polynomials of degree less than or equal to $2n - 1$, i.e., $I_n(p) = \int_a^b p(x)\,dx$ for all
polynomials $p$ of degree less than or equal to $2n - 1$. Every polynomial is therefore
integrated exactly for all sufficiently large $n$, and since the space of polynomials
is dense in $V$, Gaussian quadrature is consistent. Moreover,
\[ I_n(1) = \sum_{i=0}^{n} w_i^{(n)} = \int_a^b 1\,dx = b - a . \]
Thus $\sup_n \|I_n\| = \sup_n \sum_{i=0}^{n} w_i^{(n)} = b - a$. Therefore
Gaussian quadrature is stable. Then by the above theorem, it is convergent.
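
The two properties used in Example 4.4 can be checked numerically. The following is a minimal sketch, assuming Gauss-Legendre nodes on the reference interval $[-1, 1]$ (so $b - a = 2$) as provided by NumPy's `leggauss`; the helper name `gauss_legendre_check` is hypothetical and not part of the paper. It reports whether all weights are positive, the value of $\sum_i |w_i|$, and the largest exactness error over the monomials of degree at most $2n - 1$ for the $n$-point rule.

```python
import numpy as np

def gauss_legendre_check(n):
    """Return (weights, worst exactness error) for the n-point
    Gauss-Legendre rule on [-1, 1] (assumed reference interval)."""
    nodes, weights = np.polynomial.legendre.leggauss(n)
    worst = 0.0
    for k in range(2 * n):  # monomials x^0, ..., x^(2n-1)
        exact = 2.0 / (k + 1) if k % 2 == 0 else 0.0  # integral of x^k on [-1, 1]
        approx = np.sum(weights * nodes**k)
        worst = max(worst, abs(approx - exact))
    return weights, worst

for n in (2, 5, 10, 20):
    w, err = gauss_legendre_check(n)
    # positivity of weights, operator norm sum |w_i|, exactness error
    print(n, bool(w.min() > 0), np.sum(np.abs(w)), err)
```

Under these assumptions the printed norm stays at the interval length for every $n$, which is the uniform bound that stability asks for.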

Example 4.5 (Composite Trapezoidal Rule). The composite trapezoidal rule has
nonnegative weights. Moreover, it is exact for piecewise linear functions,
which form a dense subspace of $V$. Hence by the same argument as in the above
example, we have stability and consistency and therefore convergence of the composite
trapezoidal rule.
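
As an illustration of Example 4.5, here is a small sketch, assuming equally spaced nodes and using Runge's function on $[-1, 1]$ as hypothetical test data; the function name `composite_trapezoid` is ours. It returns both the quadrature value and the operator norm $\sum_i |w_i|$, so that stability (the norm equals $b - a$ for every $n$) and convergence can be observed side by side.

```python
import numpy as np

def composite_trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n equal subintervals.
    Returns (I_n(f), sum of |weights|)."""
    x = np.linspace(a, b, n + 1)
    w = np.full(n + 1, (b - a) / n)      # interior weights h
    w[0] = w[-1] = (b - a) / (2 * n)     # end-point weights h/2
    return np.sum(w * f(x)), np.sum(np.abs(w))

runge = lambda x: 1.0 / (1.0 + 25.0 * x**2)
exact = 0.4 * np.arctan(5.0)             # integral of Runge's function on [-1, 1]

for n in (10, 100, 1000):
    value, norm = composite_trapezoid(runge, -1.0, 1.0, n)
    print(n, norm, abs(value - exact))
```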

Example 4.6 (Newton-Cotes). Define $I_n$ to be the Newton-Cotes quadrature rule.
The nodes are equally spaced in this quadrature rule and it integrates
polynomials of a certain degree exactly. Thus $I_n$ is a consistent family by our definition.
However it is not convergent. An example of a continuous function for which $I_n(f)$
does not converge to $I(f)$ is Runge's function $f(x) = 1/(1 + 25x^2)$ on the
interval $[-1, 1]$ (see [17], page 208). This function also appears in the polynomial
interpolation section in Example 6.6. By Theorem 4.3, Newton-Cotes should
also be unstable. Indeed it is known that in Newton-Cotes rules some of the
weights have negative sign and that this leads to instability, i.e., it is known that
$\sup_n \|I_n\| = \sup_n \sum_{i=0}^{n} |w_i^{(n)}| = \infty$ (see page 350 of EE]).
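
The growth of $\sum_i |w_i^{(n)}|$ for Newton-Cotes rules can be observed in a short experiment. This is a sketch under stated assumptions: closed Newton-Cotes rules on $[-1, 1]$ with $n + 1$ equally spaced nodes, with weights obtained by solving the moment equations $\sum_i w_i x_i^k = \int_{-1}^{1} x^k\,dx$ for $k = 0, \ldots, n$. The helper `newton_cotes_weights` is hypothetical, and the Vandermonde solve is only reliable for moderate $n$ because the system becomes ill-conditioned.

```python
import numpy as np

def newton_cotes_weights(n):
    """Nodes and weights of the closed Newton-Cotes rule with n + 1
    equally spaced nodes on [-1, 1], from the moment equations."""
    x = np.linspace(-1.0, 1.0, n + 1)
    k = np.arange(n + 1)
    moments = np.where(k % 2 == 0, 2.0 / (k + 1), 0.0)  # integral of x^k on [-1, 1]
    A = np.vander(x, increasing=True).T                 # A[k, i] = x_i ** k
    return x, np.linalg.solve(A, moments)

runge = lambda x: 1.0 / (1.0 + 25.0 * x**2)
exact = 0.4 * np.arctan(5.0)

for n in (2, 8, 12, 16):
    x, w = newton_cotes_weights(n)
    # smallest weight, operator norm sum |w_i|, and error on Runge's function
    print(n, w.min(), np.sum(np.abs(w)), abs(np.sum(w * runge(x)) - exact))
```

For $n = 2$ this reproduces Simpson's rule; for larger even $n$ negative weights appear and the printed norm grows, in line with the instability discussed in Example 4.6.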

4.2. Monte Carlo Integration. For well behaved functions, i.e., functions with 
continuous derivatives, the deterministic quadrature rule is very efficient at least 
in one dimension. However, if the function fails to be well behaved or in the case 
of multidimensional integrals, other techniques can be competitive. In this section 
we will define convergence, consistency and stability for Monte Carlo integration. 
For simplicity, this is done for functions in $C[a, b]$. The notation and results from
probability theory that are needed were reviewed in Section 2.

Let $V = (C[a, b], \|\cdot\|_\infty)$ and for an $f \in V$, let $I : V \to \mathbb{R}$ be defined as
$I(f) = \int_a^b f$. Let $(W, \|\cdot\|_{\infty'})$ denote the space of all bounded random variables defined on a
probability space $(\Omega, \Sigma, P)$, where $\|X\|_{\infty'} = \operatorname{ess\,sup}_{\omega \in \Omega} |X(\omega)|$. Let $M_n : V \to W$
be a sequence of maps, not necessarily bounded linear. For a specific Monte Carlo
integration method see Example 4.9 below. In that example the discrete operators
$M_n$ are bounded linear. We have chosen to state and prove the equivalence theorem
for Monte Carlo integration (Theorem 4.8) without this assumption. This is done
to illustrate how the nonlinear case proof of an equivalence theorem works in a
probabilistic setting such as this.

We can look upon $I$ as a map from $V$ to $W$ by defining $I(f)$ to be the constant
random variable, i.e., $I(f) : \Omega \to \mathbb{R}$ is defined as $I(f)(\omega) = I(f)$ for all $\omega \in \Omega$.

Definition 4.7. A Monte Carlo integration is said to be convergent if for any
given $f \in V$,
\[ P\left[\omega \in \Omega : \lim_{n \to \infty} M_n(f)(\omega) = I(f)(\omega)\right] = 1 . \]
It is consistent if there is a dense subspace $V_0$ of $V$ such that for any $f \in V_0$, it is
convergent. It is said to be stable at $f_0 \in V$ if for any $\epsilon > 0$ there exists a $\delta > 0$
such that for each $f \in V$ with $\|f - f_0\|_\infty < \delta$, there exists $n_0 \in \mathbb{N}$ such that for all
$n > n_0$,
\[ P\left[\omega \in \Omega : |M_n(f)(\omega) - M_n(f_0)(\omega)| < \epsilon\right] = 1 . \]
It is stable if it is stable at every $f_0 \in V$.

Now we state and prove an equivalence theorem for which the proof is similar to
Theorem 3.5 but with a probabilistic flavor.

Theorem 4.8 (Equivalence Theorem for Monte Carlo Integration). A
consistent Monte Carlo integration is convergent if and only if it is stable.

Proof. Suppose it is convergent. To show stability, we need to find an $n_0 \in \mathbb{N}$ and
$\delta > 0$ such that for a given $f \in V$ with $\|f - f_0\|_\infty < \delta$, and for all $n > n_0$, we have
$P[\omega : |M_n(f)(\omega) - M_n(f_0)(\omega)| < \epsilon] = 1$. Outside a set of $P$-measure $0$ in $\Omega$ and
for all $n \in \mathbb{N}$, we have the following inequality
\begin{align*}
|M_n(f)(\omega) - M_n(f_0)(\omega)| \le{} & |M_n(f)(\omega) - I(f)(\omega)| \\
& + |I(f)(\omega) - I(f_0)(\omega)| + |I(f_0)(\omega) - M_n(f_0)(\omega)| .
\end{align*}
By convergence, there are $n_1$ and $n_2$ such that for all $n > n_1$,
\[ P[\omega : |M_n(f)(\omega) - I(f)(\omega)| < \epsilon/3] = 1 , \]
and for all $n > n_2$,
\[ P[\omega : |M_n(f_0)(\omega) - I(f_0)(\omega)| < \epsilon/3] = 1 . \]
The integral operator is bounded, because if $\|f - f_0\|_\infty < \delta$, then
$|I(f) - I(f_0)| \le \delta (b - a)$. Since $I(f)(\omega) = I(f)$ for all $\omega \in \Omega$ and by the boundedness of the integral
operator, for an appropriately chosen $\delta > 0$, we have $|I(f) - I(f_0)| < \epsilon/3$. Hence
\[ P[\omega : |I(f)(\omega) - I(f_0)(\omega)| = |I(f) - I(f_0)| < \epsilon/3] = 1 . \]
Define the three sets
\begin{align*}
\Omega_1 &= \{\omega \in \Omega : P[\omega : |M_n(f)(\omega) - I(f)(\omega)| < \epsilon/3] = 1\} , \\
\Omega_2 &= \{\omega \in \Omega : P[\omega : |I(f)(\omega) - I(f_0)(\omega)| < \epsilon/3] = 1\} , \\
\Omega_3 &= \{\omega \in \Omega : P[\omega : |I(f_0)(\omega) - M_n(f_0)(\omega)| < \epsilon/3] = 1\} .
\end{align*}
It is easy to see that $\Omega_2 = \Omega$. Let $\Omega' = \bigcap_{i=1}^{3} \Omega_i$. Let $n_0 = \max(n_1, n_2)$. For all
$n > n_0$, $P(\Omega_i) = 1$ for $1 \le i \le 3$. Since $P$ is countably additive, the measure of a
countable union of sets of measure zero is zero. Therefore $P(\Omega') = 1$. Thus for all
$\omega \in \Omega'$, except for a set of $P$-measure zero, we have $|M_n(f)(\omega) - M_n(f_0)(\omega)| < \epsilon$ for
all $n > n_0$. Hence $P[\omega : |M_n(f)(\omega) - M_n(f_0)(\omega)| < \epsilon] = 1$ for all $n > n_0$. Since $f_0$
was arbitrary the Monte Carlo integration is stable if it is convergent.

Conversely, suppose we assume stability and consistency. Therefore, given $f_0$
and $\epsilon > 0$, there exists $\delta > 0$, such that for each $f \in V$ with $\|f - f_0\|_\infty < \delta$, there
exists an $n_0 \in \mathbb{N}$ such that for all $n > n_0$,
\[ P[\omega \in \Omega : |M_n(f)(\omega) - M_n(f_0)(\omega)| < \epsilon] = 1 . \]
By the density of $V_0$ in $V$, we can choose $g \in V_0$ such that $\|g - f_0\|_\infty < \delta$. By
the boundedness of $I$, for an appropriately chosen $\delta$, we have $|I(g) - I(f_0)| < \epsilon/3$.
Again, outside a set of $P$-measure $0$ in $\Omega$ and for all $n \in \mathbb{N}$, we have the following
inequality
\begin{align*}
|M_n(f_0)(\omega) - I(f_0)(\omega)| \le{} & |M_n(f_0)(\omega) - M_n(g)(\omega)| \\
& + |M_n(g)(\omega) - I(g)(\omega)| + |I(g)(\omega) - I(f_0)(\omega)| .
\end{align*}
Stability at $f_0$ means that there exists $\delta > 0$ such that for each $g \in V_0$ with
$\|g - f_0\|_\infty < \delta$ there exists $n_1 \in \mathbb{N}$ with
\[ P[\omega : |M_n(f_0)(\omega) - M_n(g)(\omega)| < \epsilon/3] = 1 \]
for all $n > n_1$. By consistency, there is an $n_2$ such that for all $n > n_2$,
\[ P[\omega : |M_n(g)(\omega) - I(g)(\omega)| < \epsilon/3] = 1 . \]
Moreover,
\[ P[\omega : |I(g)(\omega) - I(f_0)(\omega)| < \epsilon/3] = 1 . \]
By an argument identical to that in the previous half of the proof, on a subset $\Omega'$ of $\Omega$
with $P$-measure one, we have
\[ P[\omega \in \Omega' : |M_n(f_0)(\omega) - I(f_0)(\omega)| < \epsilon] = 1 \]
for all $n > n_0 = \max(n_1, n_2)$. Hence we have convergence at $f_0$. Since $f_0$ was
arbitrary we have that stability implies convergence. □

As an example of numerical integration using Monte Carlo methods we will 
examine the Sample-Mean Monte Carlo method and discuss its convergence and 
stability. We prove both convergence and stability separately. An alternative would 
have been to prove consistency and one of the other two properties and infer the 
third from Theorem [ 



Example 4.9 (Sample-Mean Monte Carlo Method). Suppose we want to compute
the approximate value of the integral $I = \int_a^b f(x)\,dx$. First, we choose any function
$g \in C[a, b]$ with the property that $g > 0$ and $\int_a^b g = 1$. Since $g \in C[a, b]$ and is
positive, there exists $m \in \mathbb{R}$ such that $0 < m \le g(x)$ in $[a, b]$.

Then there exists some random variable $X$ with range in $[a, b]$, such that $g(x)$ is
the pdf of $X$ [TB]. Then, consider the random variable $Y = f(X)/g(X)$. Therefore,
\[ E(Y) = \int_a^b \frac{f(x)}{g(x)}\, g(x)\,dx = \int_a^b f = I . \]
Now, choose a sequence of independent and identically distributed random variables
$X = X_1, X_2, \ldots, X_n, \ldots$ Since $f$, $g$ are continuous, they are Borel measurable. Since
$g \ne 0$, $f/g$ is Borel measurable. Hence the sequence of random variables
$Y = f(X)/g(X) = f(X_1)/g(X_1) = Y_1$, $Y_2 = f(X_2)/g(X_2)$, $\ldots$, $Y_n = f(X_n)/g(X_n)$, $\ldots$
are independent and identically distributed by Theorem 2.7. Hence for the chosen
pdf $g$, we can define $M_n : V \to W$ as
\[ M_n(f) = \frac{1}{n} \sum_{i=1}^{n} \frac{f(X_i)}{g(X_i)} . \]
By Theorem 2.6,
\[ P\left[\omega : \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} Y_i(\omega) = I\right] = 1 . \]
Thus we have the convergence of $M_n$. Since $M_n$ is linear in $f$, we can check for the
boundedness of this linear operator:
\[ \|M_n\| = \sup_{\|f\|_\infty \le 1} \left\| \frac{1}{n} \sum_{i=1}^{n} \frac{f(X_i)}{g(X_i)} \right\|_{\infty'} \le \sup_{\|f\|_\infty \le 1} \frac{1}{n} \sum_{i=1}^{n} \left\| \frac{f}{g} \right\|_\infty \le \frac{1}{m} . \]
To get the estimate on the norm of $M_n$ we have used the fact that
$\|h(X)\|_{\infty'} \le \|h\|_\infty$, where $h = f/g$. This is true because
\[ \|h(X)\|_{\infty'} = \operatorname{ess\,sup}_{\omega \in \Omega} |h(X(\omega))| = \inf\{M : P[\omega \in \Omega : |h(X(\omega))| > M] = 0\} , \]
and hence the bound on $h(X)$ is controlled by the bound on $h$.






The bound on $M_n$ is independent of $n$ and depends only on $g$. To exhibit stability
we have to show that given $\epsilon > 0$ there exist $\delta > 0$, $n_0 \in \mathbb{N}$ such that for all $n > n_0$

\[ P\left[\omega : |M_n(f)(\omega) - M_n(f_0)(\omega)| < \epsilon\right] = 1, \]

whenever $\|f - f_0\|_\infty < \delta$. But outside a set of $P$-measure zero and for all $n \in \mathbb{N}$,
we have

\[ |M_n(f)(\omega) - M_n(f_0)(\omega)| \le \|M_n(f) - M_n(f_0)\|_{\infty'} \le \|M_n\| \, \|f - f_0\|_\infty. \]

Therefore, for an appropriately chosen $\delta > 0$ and for all $n \in \mathbb{N}$, we have

\[ P\left[\omega : |M_n(f)(\omega) - M_n(f_0)(\omega)| < \epsilon\right] = 1, \]

hence stability.

The random variables $X_n$ are not necessarily unique for a given $g$. However, it
does not matter as the Sample-Mean Monte Carlo method is stable and convergent
if we choose a positive pdf $g \in C[a,b]$.
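
The Sample-Mean Monte Carlo method of Example 4.9 can be tried numerically. The following is a minimal sketch, not part of the original text: the interval $[0,1]$, the pdf $g(x) = (1+x)/1.5$ (so $m = 2/3$), the integrand $f_0(x) = x^2$, the perturbation, the seed, and all function names are assumptions chosen for illustration. It estimates $I = 1/3$ and checks the stability estimate $|M_n(f) - M_n(f_0)| \le \|f - f_0\|_\infty / m$ on one fixed sample.

```python
import math
import random


def g_pdf(x):
    """Positive continuous pdf on [0, 1]: g(x) = (1 + x) / 1.5, so m = min g = 2/3."""
    return (1.0 + x) / 1.5


def draw_from_g(n, rng):
    """Draw n i.i.d. samples with pdf g by inverting its CDF G(x) = (x + x^2 / 2) / 1.5."""
    return [-1.0 + math.sqrt(1.0 + 3.0 * rng.random()) for _ in range(n)]


def sample_mean(f, xs):
    """M_n(f) = (1/n) * sum_i f(X_i) / g(X_i) for the fixed sample xs."""
    return sum(f(x) / g_pdf(x) for x in xs) / len(xs)


rng = random.Random(0)  # fixed seed (an assumption) so the run is reproducible
xs = draw_from_g(20000, rng)

f0 = lambda x: x * x  # exact integral over [0, 1] is 1/3
delta = 1e-3
f = lambda x: x * x + delta * math.sin(50.0 * x)  # ||f - f0||_inf <= delta

estimate = sample_mean(f0, xs)
gap = abs(sample_mean(f, xs) - sample_mean(f0, xs))
bound = delta / (2.0 / 3.0)  # ||M_n|| * delta, using ||M_n|| <= 1/m

print("M_n(f0) =", estimate)
print("|M_n(f) - M_n(f0)| =", gap, "<=", bound)
```

The stability check holds for every sample, not just on average, because the bound $1/m$ on $\|M_n\|$ does not depend on $n$ or on the draw.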

5. Numerical Differentiation 

If sufficiently differentiable functions are considered and a sum of sup norms on
the function and its derivatives is used as the norm, then smooth differentiation
is a bounded linear operator. Numerical differentiation can be posed as a parameterized
collection of linear operators. If they are assumed to be bounded as well, then the
definitions and equivalence theorem from Section 3 apply here. We show in this
section, and it is no surprise, that the equivalence theorem is actually not needed
for proving convergence or stability for the usual finite difference formulas for the
first derivative such as the forward, backward and central difference formulas. We also
give the example of the lowest order, three-point finite difference formula for the second
derivative, for which the equivalence theorem is also not required. For these and similar
simple formulas, the proofs of stability and convergence can be done directly and
independently of each other and are easy.

Thus there is no practical benefit of an equivalence theorem in the context of
such simple finite difference formulas for numerical differentiation in one dimension.
However, the equivalence theorem might be of benefit in proving stability or
convergence for formulas of high order accuracy for arbitrary derivatives on
non-equally spaced grids such as the finite difference formulas in [8] [9]. The mesh size
parameter $h$ appears as $h^k$ in the denominator of these and similar formulas and
stability proofs might be tedious. Here $k$ is the order of the derivative. We do not
discuss these formulas further in this paper.

For $k \ge 1$, define $\|f\|_{C^k} := \sum_{i=0}^{k} \|f^{(i)}\|_\infty$ where $f^{(i)}$ denotes the $i$-th derivative
of $f$. Let $V^k = (C^k[a,b], \|\cdot\|_{C^k})$ and let $W = (C[a,b], \|\cdot\|_\infty)$. Consider $D_h^{(k)}, D^{(k)} :
V^k \to W$, where $D_h^{(k)}$ and $D^{(k)}$ are the discrete and smooth differentiation operators
respectively. Here $h \in (0,1)$ is a parameter of the discrete operators and is a
measure of how far apart the points used in, say, a finite difference formula are. In
the chosen norms, $D^{(k)}$ is a bounded linear operator. In addition, assume also that
$D_h^{(k)}$ are bounded linear operators.
The definitions of consistency, convergence and stability in Definition 3.1 can
now be repeated in the context of numerical differentiation.

Definition 5.1. Numerical differentiation is said to be convergent if

\[ \lim_{h \to 0} \left\| D_h^{(k)} f - D^{(k)} f \right\|_\infty = 0, \]

for every $f \in V^k$ and it is consistent if it converges on a dense subspace of $V^k$. It
is said to be stable if $\|D_h^{(k)}\| \le C$ where $C$ is independent of the parameter $h$.

Theorem 5.2 (Equivalence Theorem for Numerical Derivatives). A consistent
finite difference scheme is convergent if and only if it is stable.

Proof. The proof is identical to the proof of Theorem 3.3 with $T_h$ replaced by $D_h^{(k)}$
and $T$ by $D^{(k)}$. □

Example 5.3 (Forward Difference). As an example we now consider the basic
forward difference approximation to the first derivative which we will show to be
stable and convergent. The equivalence theorem is not required in this case although
using it would reduce the work required in proving stability and convergence. It
is easy enough to prove convergence and stability separately in an elementary way.
Consider a point $x \in [a, b-h]$. Let $h \in (0,1)$ be such that $x + h \le b$. We will show
that the forward difference approximation to the first derivative

\[ D_h^{(1)} f(x) = \frac{f(x+h) - f(x)}{h} \]

is both convergent and stable. First note that

\[ \|D_h^{(1)}\| = \sup_{\|f\|_{C^1} = 1} \|D_h^{(1)} f\|_\infty = \sup_{\|f\|_{C^1} = 1} \sup_{x \in [a, b-h]} \left| \frac{f(x+h) - f(x)}{h} \right| = \sup_{\|f\|_{C^1} = 1} \sup_{x \in [a, b-h]} |f'(x + \theta h)| \le 1, \]

for some $0 < \theta < 1$. Thus $D_h^{(1)}$ is stable. Moreover, by the Mean Value Theorem we
have

\[ \lim_{h \to 0} \left\| D_h^{(1)} f - D^{(1)} f \right\|_\infty = \lim_{h \to 0} \sup_{x \in [a, b-h]} \left| \frac{f(x+h) - f(x)}{h} - f'(x) \right| = \lim_{h \to 0} \sup_{x \in [a, b-h]} |f'(x + \theta h) - f'(x)| = 0, \]

where the last equality uses the uniform continuity of $f'$ on $[a,b]$. This means that
$D_h^{(1)}$ converges to $D^{(1)}$. The convergence and stability of the
other commonly used finite difference schemes like backward difference and central
difference schemes can be similarly proved.
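
The convergence and stability of the forward difference can be observed numerically. The sketch below is illustrative only: the test function $f = \sin$ on $[0,1]$, the step sizes, the sampling grid used to approximate the sup norm, and the helper names are all assumptions, not part of the original text.

```python
import math


def forward_difference(f, x, h):
    """D_h^(1) f(x) = (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def forward_difference_error(f, df, a, b, h, samples=2000):
    """Grid estimate of the sup over [a, b - h] of |D_h^(1) f(x) - f'(x)|."""
    xs = [a + (b - h - a) * j / samples for j in range(samples + 1)]
    return max(abs(forward_difference(f, x, h) - df(x)) for x in xs)


steps = [0.1, 0.01, 0.001]
errors = [forward_difference_error(math.sin, math.cos, 0.0, 1.0, h) for h in steps]

for h, err in zip(steps, errors):
    # For f = sin the error is (h / 2) |sin(xi)| for some xi, hence at most h / 2.
    print(f"h = {h:g}: sup error ~ {err:.3e} (bound h/2 = {h / 2:.3e})")
```

For this smooth test function the observed error shrinks proportionally to $h$, and the computed difference quotients never exceed $\|f'\|_\infty = 1$ in absolute value, in line with the stability bound.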

The stability proved above means that a slight perturbation of $f \in V^1$ does
not drastically change the computed numerical derivative. Note however that if
the $C^1$ norm is replaced by the sup norm, i.e., if the space $V^1$ is replaced by
$(C^1[a,b], \|\cdot\|_\infty)$, then the finite difference schemes above are highly sensitive to
small perturbations, leading to instability. This can be seen in the following example.

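Example 5.4 (Unboundedness in sup Norm). Define a family $g_h(x) = \sin(\pi x/h)$
in $(C^1[0,1], \|\cdot\|_\infty)$. Note that $\|g_h\|_\infty = 1$. The frequency $\pi/h$ is chosen so that a
shift by $h$ flips the sign of $g_h$; a function with period exactly $h$ would have an
identically zero forward difference. Since $\sin(\pi(x+h)/h) = -\sin(\pi x/h)$,

\[ \left\| D_h^{(1)} g_h \right\|_\infty = \sup_{x \in [0, 1-h]} \left| \frac{\sin(\pi(x+h)/h) - \sin(\pi x/h)}{h} \right| = \sup_{x \in [0, 1-h]} \frac{2}{h} \left| \sin(\pi x/h) \right| = \frac{2}{h} \]

for all $h \in (0, 2/3]$, the supremum being attained at $x = h/2$.

This implies that $\|D_h^{(1)}\| \ge 2/h$, so the operators $D_h^{(1)}$ are not uniformly bounded
in the sup norm as $h \to 0$.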
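
The loss of stability in the sup norm can also be seen numerically. The following sketch is an illustration under stated assumptions: it uses the test family $g_h(x) = \sin(\pi x/h)$, chosen so that a shift by $h$ flips the sign of the function (a function of period exactly $h$ would give a zero forward difference), and the grid size and helper names are hypothetical. Each $g_h$ has sup norm 1, yet the sup norm of its forward difference grows like $2/h$.

```python
import math


def forward_difference(f, x, h):
    """D_h^(1) f(x) = (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def sup_forward_difference(h, samples=20000):
    """Grid estimate of the sup over [0, 1 - h] of |D_h^(1) g_h(x)|, g_h(x) = sin(pi x / h)."""
    g_h = lambda x: math.sin(math.pi * x / h)
    xs = [(1.0 - h) * j / samples for j in range(samples + 1)]
    return max(abs(forward_difference(g_h, x, h)) for x in xs)


steps = [0.1, 0.01, 0.001]
sups = [sup_forward_difference(h) for h in steps]

for h, s in zip(steps, sups):
    print(f"h = {h:g}: sup |D_h g_h| ~ {s:.1f}, 2/h = {2.0 / h:.1f}")
```

Since the inputs all have unit sup norm, the printed values are lower bounds for the operator norm of the forward difference in that norm.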

Example 5.5 (Second Derivative). We now discuss the convergence and stability
of the most basic finite difference operator for the second derivative. We will show that
the finite difference approximation for the second derivative

\[ D_h^{(2)} f(x) = \frac{f(x+h) - 2f(x) + f(x-h)}{h^2} \]

is both convergent and stable. The operators $D_h^{(2)}$ are consistent and so it is actually
enough to prove just stability or convergence due to Theorem 5.2. But as in the
first derivative case, we will prove stability and convergence separately, since both
are easy to prove. To prove stability, note that

\[ \|D_h^{(2)}\| = \sup_{\|f\|_{C^2} = 1} \|D_h^{(2)} f\|_\infty = \sup_{\|f\|_{C^2} = 1} \sup_{x \in [a+h, b-h]} \left| \frac{f(x+h) - 2f(x) + f(x-h)}{h^2} \right|. \]

By Taylor's theorem with the Lagrange form of the remainder, $f(x \pm h) = f(x) \pm h f'(x) +
\tfrac{h^2}{2} f''(\xi_\pm)$ for some $\xi_+ \in (x, x+h)$ and $\xi_- \in (x-h, x)$, so the difference quotient
equals $\tfrac{1}{2}(f''(\xi_+) + f''(\xi_-))$, which by the intermediate value theorem applied to $f''$
is $f''(\xi)$ for some $\xi \in (x-h, x+h)$. Hence the above is equal to

\[ \sup_{\|f\|_{C^2} = 1} \sup_{x \in [a+h, b-h]} |f''(\xi)| \le \sup_{\|f\|_{C^2} = 1} \|f''\|_\infty \le 1. \]

Hence $D_h^{(2)}$ is stable. Moreover, with the same $\xi$ we have

\[ \lim_{h \to 0} \left\| D_h^{(2)} f - D^{(2)} f \right\|_\infty = \lim_{h \to 0} \sup_{x \in [a+h, b-h]} \left| \frac{f(x+h) - 2f(x) + f(x-h)}{h^2} - f''(x) \right| = \lim_{h \to 0} \sup_{x \in [a+h, b-h]} |f''(\xi) - f''(x)| = 0, \]

since $f''$ is uniformly continuous on $[a,b]$ and $|\xi - x| < h$,
which means convergence. Again, if the $C^2$ norm is replaced by the $C^1$ norm or the
sup norm, the operator $D_h^{(2)}$ fails to be stable. Examples similar to Example 5.4
above can be constructed to exhibit the unboundedness of the operator $D_h^{(2)}$.
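
A numerical check of the three-point second difference is sketched below. It is a hedged illustration, not part of the original text: the test function $f = \sin$ on $[0,1]$, the step sizes, the sampling grid, and the helper names are assumptions. For this smooth function the observed error is of order $h^2$, and the difference quotients stay bounded by $\|f''\|_\infty = 1$.

```python
import math


def second_difference(f, x, h):
    """D_h^(2) f(x) = (f(x + h) - 2 f(x) + f(x - h)) / h^2."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def second_difference_error(f, d2f, a, b, h, samples=2000):
    """Grid estimate of the sup over [a + h, b - h] of |D_h^(2) f(x) - f''(x)|."""
    xs = [a + h + (b - a - 2.0 * h) * j / samples for j in range(samples + 1)]
    return max(abs(second_difference(f, x, h) - d2f(x)) for x in xs)


neg_sin = lambda x: -math.sin(x)  # second derivative of sin

steps = [0.1, 0.05, 0.01]
errors = [second_difference_error(math.sin, neg_sin, 0.0, 1.0, h) for h in steps]

for h, err in zip(steps, errors):
    print(f"h = {h:g}: sup error ~ {err:.3e} (h^2/12 = {h * h / 12:.3e})")
```

The step sizes are kept moderate on purpose: for much smaller $h$ the division by $h^2$ amplifies floating-point rounding, which is a separate effect from the stability notion discussed in the text.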

6. Polynomial Interpolation 






The interpolation problem is to find a function that takes on prescribed values
at specified points. In one dimension the data is given as $(x_i, y_i)$ for $i = 0, \ldots, n$,
with $x_0 < x_1 < \cdots < x_n$, and we look for a function $f : \mathbb{R} \to \mathbb{R}$ called the interpolating
function such that $f(x_i) = y_i$ for all $i$. In order to define the notions of consistency,
convergence and stability it is convenient to pose the interpolation problem as
interpolation of continuous functions. For any given data which is to be interpolated,
there is the obvious unique piecewise linear continuous function that interpolates
the data. Stability with respect to changes in the given values or the locations
of the values are both captured by stability with respect to changes in the
piecewise linear continuous function. We will only address interpolation of continuous
functions by polynomials. It is easy to show the existence and uniqueness of the
interpolant polynomial in the case of one dimensional interpolation, i.e., when the
dimension of the domain of the function is one [ITJ [20].

Let $\{p_n\}$ be a sequence of polynomials of degree at most $n$, interpolating a
function $f$ in $V = (C[a,b], \|\cdot\|_\infty)$ on a set of interpolation points $\{x_i^{(n)}\}_{i=0}^{n}$, where
$a \le x_0^{(n)} < x_1^{(n)} < \cdots < x_n^{(n)} \le b$. We can define polynomial interpolation as a map
$P_n : V \to \mathbb{P}_n \subset V$, where $\mathbb{P}_n$ is the space of polynomials of degree at most $n$ and
$P_n f$ is defined as the unique polynomial $p \in \mathbb{P}_n$ which interpolates $f$ at the given
nodes. A basic fact about error in polynomial interpolation is given by the following
lemma.

Lemma 6.1. Let $f \in C^{(n+1)}[a,b]$, and let $a \le x_0 < x_1 < \cdots < x_n \le b$ be a
partition of $[a,b]$. Then for all $x \in (a,b)$, there exists $\xi \in (\min\{x_0, x\}, \max\{x_n, x\})$
such that

\[ f(x) - (P_n f)(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod_{i=0}^{n} (x - x_i). \]

Proof. The proof requires repeated use of Rolle's Theorem. See page 119 of [2] for
details. □

The following Lemma 6.2 shows that the $P_n$ operators are linear, and Lemma 6.3
shows that they are bounded. The corresponding smooth operator is the identity
map which is bounded linear. Thus the linear case definitions of consistency,
convergence and stability given in Definition 3.1 apply here as well.

Lemma 6.2 (Linearity). The interpolation operator $P_n : V \to \mathbb{P}_n \subset V$ is linear.

Proof. Let $P_n(f) = p_1$ and $P_n(g) = p_2$, where $f, g \in V$ and $p_1, p_2$ are the unique
polynomials in $\mathbb{P}_n$ such that $f(x_i) = p_1(x_i)$ and $g(x_i) = p_2(x_i)$ for $0 \le i \le n$.
Hence $(f + g)(x_i) = (p_1 + p_2)(x_i)$. By uniqueness of polynomial interpolation
$P_n(f + g) = p_1 + p_2$. Hence $P_n(f + g) = P_n f + P_n g$. Similarly using uniqueness we
can show that $P_n(cf) = cP_n(f)$, for any $c \in \mathbb{R}$. □

Lemma 6.3 (Boundedness). Each interpolation operator $P_n$ is bounded, i.e., $\|P_n\| =
\sup_{\|f\|_\infty = 1} \|P_n(f)\|_\infty \le K(n)$ where $K(n) > 0$.

Proof. By Lemma 6.1, $P_n(f) = f$, where $f$ is any polynomial of degree $\le n$. Hence
$P_n$ maps onto $\mathbb{P}_n$. Let $f_m$ be a sequence in $P_n^{-1}(0)$ converging to $f$ in $V$. Since
$f_m(x_i) = 0$ for all $0 \le i \le n$ and $\lim_{m \to \infty} f_m(x) = f(x)$ for all $x \in [a,b]$, we have
$f(x_i) = 0$ for all $0 \le i \le n$. Therefore $f \in P_n^{-1}(0)$. Hence the kernel of $P_n$ is closed.
Therefore, by Lemma 2.2, $P_n$ is continuous and hence bounded. □
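
Lemma 6.3 only asserts that some bound $K(n)$ exists. A standard fact, not stated in the text above, is that in the sup norm $\|P_n\|$ equals the Lebesgue constant $\max_{x} \sum_{i=0}^{n} |\ell_i(x)|$, where $\ell_i$ are the Lagrange basis polynomials for the nodes. The sketch below estimates this quantity on a sampling grid for equispaced nodes on $[-1,1]$; the interval, the grid size, and all function names are assumptions made for illustration.

```python
def lagrange_basis(nodes, i, x):
    """Value at x of the i-th Lagrange basis polynomial for the given nodes."""
    value = 1.0
    for j, xj in enumerate(nodes):
        if j != i:
            value *= (x - xj) / (nodes[i] - xj)
    return value


def lebesgue_constant(nodes, a, b, samples=2000):
    """Grid estimate of the max over [a, b] of sum_i |l_i(x)|."""
    grid = [a + (b - a) * k / samples for k in range(samples + 1)]
    return max(
        sum(abs(lagrange_basis(nodes, i, x)) for i in range(len(nodes)))
        for x in grid
    )


def equispaced_nodes(n, a=-1.0, b=1.0):
    """n + 1 equally spaced nodes on [a, b]."""
    return [a + (b - a) * i / n for i in range(n + 1)]


constants = {n: lebesgue_constant(equispaced_nodes(n), -1.0, 1.0) for n in (5, 10, 20)}

for n, value in constants.items():
    print(f"n = {n}: estimated ||P_n|| ~ {value:.1f}")
```

The rapid growth of these estimates with $n$ is what rules out a bound on $\|P_n\|$ that is uniform in $n$ for equispaced nodes.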

Definition 6.4. Interpolation is convergent if for any given $f \in V$ the sequence of
interpolant polynomials $P_n f$ converges to $f$ in $V$. It is consistent if there is a dense
subspace $V_0$ of $V$ such that for any $f$ in $V_0$, the sequence of interpolant polynomials
$P_n f$ converges to $f$ in $V$. Interpolation is said to be stable if $\sup_n \|P_n\| < \infty$.

If the function being interpolated is a polynomial then the interpolant is exact.
This follows from Lemma 6.1. Since polynomials are dense in $V$ (Weierstrass
Approximation Theorem) this implies that interpolation of continuous functions by
polynomials is consistent.

We now prove an equivalence theorem for polynomial interpolation. In the
notation of Theorem 3.3 the discrete operators $T_n$ of the theorem correspond to the
polynomial interpolation operator $P_n$ and the smooth operator $T$ corresponds to
the identity operator.

Theorem 6.5 (Equivalence Theorem for Interpolation). A consistent polynomial
interpolation is convergent if and only if it is stable.

Proof. The proof is identical to the one given for the general linear equivalence
theorem (Theorem 3.3) with discretization parameter $h$ replaced by $n$. Thus the
proof of the equivalence theorem for quadrature (Theorem 4.3) can be used here
with appropriate modifications. □

Example 6.6 (Runge's Function). It is well known that for Runge's function
$f(x) = 1/(1 + 25x^2)$ in the interval $[-1,1]$ the polynomial interpolants do not
converge if the interpolation points are uniformly spaced. See for example [12]. As
we noted earlier, interpolation of continuous functions by polynomials is consistent.
However, the method is not stable if equispaced points are used for interpolation.
In light of the equivalence theorem above (Theorem 6.5) this agrees with the lack
of convergence. To see the lack of stability, let $p(x)$ be any polynomial which
is arbitrarily close to Runge's function. Existence of $p(x)$ is guaranteed by
the Weierstrass Approximation Theorem. Now, the interpolating polynomials for
$p(x)$ are exact as we increase the number of interpolating points. However, the
interpolating polynomial for Runge's function is not close to $p(x)$. This shows
that polynomial interpolation of continuous functions using equispaced points is
not discretely well-posed, i.e., it is not stable.
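
The divergence for Runge's function can be reproduced with a short experiment. The sketch below is illustrative and makes several assumptions not in the original text: it evaluates the interpolant in plain Lagrange form, approximates the sup norm on a uniform sampling grid, and uses hypothetical helper names and the degrees 5, 10 and 20.

```python
def runge(x):
    """Runge's function f(x) = 1 / (1 + 25 x^2)."""
    return 1.0 / (1.0 + 25.0 * x * x)


def interpolate(nodes, values, x):
    """Evaluate the Lagrange form of the interpolating polynomial at x."""
    total = 0.0
    for i, xi in enumerate(nodes):
        term = values[i]
        for j, xj in enumerate(nodes):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def max_interpolation_error(n, samples=2000):
    """Grid estimate of ||f - P_n f||_inf on [-1, 1] for n + 1 equispaced nodes."""
    nodes = [-1.0 + 2.0 * i / n for i in range(n + 1)]
    values = [runge(x) for x in nodes]
    grid = [-1.0 + 2.0 * k / samples for k in range(samples + 1)]
    return max(abs(runge(x) - interpolate(nodes, values, x)) for x in grid)


errors = {n: max_interpolation_error(n) for n in (5, 10, 20)}

for n, err in errors.items():
    print(f"n = {n}: estimated sup error ~ {err:.3f}")
```

The estimated error grows with the degree instead of shrinking, with the largest deviations occurring near the endpoints of the interval.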

Interpolating real valued functions whose domain has dimension greater than
one is a much harder problem, unlike the one dimensional case where the polynomial
interpolant always exists and is unique. For example, in two-variable interpolation,
if data is specified at 3 points that are collinear, then there are infinitely many
linear interpolants. The geometry of the point locations becomes an important
factor in existence and uniqueness of the interpolant. For a result of this type in
the two-variable case see, for example, [14]. We need the existence and uniqueness of
the interpolant so that there is a well defined operator $P_n$.

There does exist a generalization, called Ciarlet's error formula [5] [4] [10], of
Lemma 6.1 to the multivariable case. Moreover, multivariate polynomials are dense
in the sup norm, in the space of multivariate continuous functions on compact
subsets of $\mathbb{R}^m$ (Stone-Weierstrass Theorem) [7]. Thus when multivariate polynomial
interpolation does exist and is unique, we get consistency as before. Consistency,
stability and convergence can be defined as described in the univariate case and
thus an equivalence theorem exists for the multivariate polynomial interpolation of
continuous functions. Of course the conditions for stability and convergence are
more complicated in the multivariate case and we do not address those here.

In contrast with interpolation, approximation methods do not require agreement
with data at specific points. For example one might seek a polynomial $p$ such that
$\min_{p \in \mathbb{P}_n} \|f - p\|_\infty$ is attained. In this particular case one can show existence and
uniqueness. See for instance Theorem 7.5.6 in [7]. One can define the notion of
convergence, consistency and stability for approximation problems in an identical
fashion as for interpolation and prove a theorem like Theorem 6.5. In the particular
case considered above, the approximation always converges (Theorem 7.6.1 of [7])
and hence is stable.

7. Conclusions and Future Work 

We have shown that equivalence theorems can be proved in a general setting 
in numerical analysis. As in the classical Lax-Richtmyer equivalence theorem in 
PDEs, these theorems state that a consistent method is convergent if and only if it 
is stable. The notion of stability we used was that of discrete well-posedness and we
defined consistency to be convergence on a dense subspace. The discretizations of
the smooth problems were considered to be linear or nonlinear operators on normed
linear spaces, depending on some parameter which measures the discretization
size. We showed that our general equivalence theorems require basic tools like
the uniform boundedness principle, the triangle inequality, and density arguments.
Convergence implying stability was always obtained independently of consistency.

In this paper, we have studied stability with respect to perturbation of input
data of the discrete operators. However, one could also study stability with respect
to locations of points, shape of domain, etc. One can also investigate equivalence
theorems in other numerical analysis contexts. Some examples are multidimensional
quadrature rules, multidimensional Monte Carlo integration, optimization,
eigenvalue problems in PDEs, and formulas for numerical differentiation of any order
and accuracy, such as those given in [9].

We defined consistency as convergence on a dense subspace. In our examples,
the dense subspace was often the polynomials within the continuous functions. It may
be worthwhile to study if and how the choice of a particular dense subspace affects
the theory and applications of equivalence theorems.